Psychology as a Science
What do we mean by “probability”?
It might seem like there’s an easy answer to this question, but there are at least three senses of probability.
These different senses are often employed in different contexts, because each makes more sense in some contexts than in others.
The three I’ll cover are:
The classical view of probability
The frequency view of probability
The subjective view of probability
The classical view is often used in the context of games of chance like roulette and lotteries
We can sum it up as follows:
If we have an (exhaustive) list of events that can be produced by some (exhaustive) list of equiprobable outcomes (the number of events and outcomes need not be the same), then the probability of a particular event occurring is just the proportion of outcomes that produce that event.
To make it concrete we’ll think about flipping coins. If we flip two coins the possible outcomes that can occur are:
HH, HT, TH, TT
If we’re interested in a particular event—for example, the event of “obtaining at least one head from two flips”—then we just count the number of outcomes that produce that event.
HH, HT, TH (TT is the only outcome without a head)
Three out of four outcomes would produce the event of “at least one head”, so the probability is \(\frac{3}{4}\) or 0.75
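The counting above can be sketched in a few lines of Python. This is only an illustration, and it assumes a fair coin so that all four outcomes are equally likely; the variable names are mine.

```python
from fractions import Fraction
from itertools import product

# Every equally likely outcome of flipping two coins: HH, HT, TH, TT
outcomes = ["".join(flips) for flips in product("HT", repeat=2)]

# The outcomes that produce the event "at least one head"
at_least_one_head = [o for o in outcomes if "H" in o]

# Classical probability: outcomes that produce the event / all outcomes
p_at_least_one_head = Fraction(len(at_least_one_head), len(outcomes))

print(outcomes)
print(at_least_one_head)
print(p_at_least_one_head, float(p_at_least_one_head))
```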
If you’re viewing probability like this, it’s very important to be clear about what counts as a possible outcome.
E.g., When playing the lottery, how many outcomes are there?
Two outcomes? You pick the correct numbers or you don’t? So the probability of winning is \(\frac{1}{2}\)?
Of course not! There are 45,057,474 possible outcomes: 1 leads to you winning and 45,057,473 lead to you not winning!
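Where does a number like 45,057,474 come from? The slide doesn't say which lottery it has in mind, so the sketch below assumes a draw of 6 numbers from 59 where order doesn't matter. That assumption happens to give exactly this count.

```python
from fractions import Fraction
from math import comb

# Assumption (not stated on the slide): pick 6 numbers from 59, order ignored
n_outcomes = comb(59, 6)

# Only one of those equally likely outcomes matches your ticket
p_win = Fraction(1, n_outcomes)

print(n_outcomes)
print(float(p_win))
```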
When you take a frequency view of probability you’re making a claim about how often, over some long period of time, some event occurs.
The frequency view is often the view that we take in science. If we wanted to assign a probability to the claim “drug X lowers depression”, we can’t just think of each possible outcome that could occur when people take Drug X and then count up how many lead to lower depression and how many do not.
No way to make an exhaustive list of every possible outcome!
But we can run an experiment where we give Drug X and see whether it lowers depression. And we can repeat this many, many times. Then we count up the proportion of experiments in which depression was lowered.
That is then the probability that Drug X lowers depression.
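We can't run a drug trial in code, so the sketch below uses a simulated coin as a stand-in for "repeat it many times and count". The function name, the fixed seed, and the 0.5 chance of heads are all assumptions made for illustration.

```python
import random

def proportion_heads(n_flips, seed=1):
    """Flip a simulated fair coin n_flips times; return the proportion of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

# The observed proportion settles down as the number of repetitions grows
for n in (10, 100, 10_000, 100_000):
    print(n, proportion_heads(n))
```

With only a handful of flips the proportion can be well away from 0.5; with many flips it should sit close to it. That long-run proportion is what the frequency view calls the probability.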
Consider the following statement:
The England cricket team will lose the upcoming test series against South Africa
There is a sense in which you can assign a probability to this
But it isn’t the classical kind—we can’t just enumerate all the possible outcomes that lead to this event
Nor is it the frequency kind—we can’t repeat the 2020/2021 cricket tour over and over and see how often England lose.
When we talk about probability in this context we mean something like degree of belief, credence, or subjective probability.
Probability in this context is the answer to the question “how sure are you that the England cricket team will lose the upcoming test series against South Africa?”
The different views of probability have to do with what the numbers mean, but once we have the numbers there are no real disagreements about how we do calculations with those numbers¹
Some properties of probabilities
When we attach numbers to probabilities those numbers must range from 0 to 1
If an event has probability 0 then it is impossible
If an event has probability 1 then it is guaranteed
These two simple rules can help us to check our calculations with probabilities. If we get a value more than 1 or a value less than 0, then something has gone wrong!
.footnote[¹ Probabilities don’t always have to have numbers attached. There is a sense in which something can be more probable than something else without numbers being attached.]
Whenever two events are mutually exclusive:
The probability that at least one of them occurs is the sum of their individual probabilities
If we flip a coin, one of two things can happen. It can land Heads, or it can land Tails. It can’t land both Heads and Tails (they’re mutually exclusive), and one of those things must happen (it’s a list of all possible events)
What’s the probability that at least one of those events happens? Since one of those events must happen, the probability must be 1
But we can work it out from the individual probabilities
1 of the 2 possible outcomes produces Heads, so P(Heads) = \(\frac{1}{2}\) = 0.50
1 of the 2 possible outcomes produces Tails, so P(Tails) = \(\frac{1}{2}\) = 0.50
The probability of at least one of Heads or Tails occurring is 0.5 + 0.5 = 1
Consider a deck of cards:
(1) What is the probability of pulling out a Spade or a Club?
(2) What is the probability of pulling out a Spade or an Ace?
In situation (1) the events are mutually exclusive or disjoint. A card can’t be a Spade AND a Club. It will either be a Spade, a Club, or something else. The addition rule applies: P(Spade or Club) = P(Spade) + P(Club).
In situation (2) the events are not mutually exclusive. A card can be both a Spade and an Ace, so we need a different rule.
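Both situations can be checked by counting cards. The sketch below assumes a standard 52-card deck and uses the general addition rule, P(A or B) = P(A) + P(B) − P(A and B), for the second case; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

suits = ["Spade", "Club", "Heart", "Diamond"]
ranks = ["Ace"] + [str(n) for n in range(2, 11)] + ["Jack", "Queen", "King"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

def prob(event):
    """Proportion of cards in the deck for which event(card) is True."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_spade(card):
    return card[1] == "Spade"

def is_club(card):
    return card[1] == "Club"

def is_ace(card):
    return card[0] == "Ace"

# Situation (1): mutually exclusive, so the simple sum matches the count
p_spade_or_club = prob(lambda c: is_spade(c) or is_club(c))
print(p_spade_or_club, prob(is_spade) + prob(is_club))

# Situation (2): the Ace of Spades would be counted twice by the simple sum
p_spade_or_ace = prob(lambda c: is_spade(c) or is_ace(c))
p_spade_and_ace = prob(lambda c: is_spade(c) and is_ace(c))
print(p_spade_or_ace, prob(is_spade) + prob(is_ace) - p_spade_and_ace)
```

In the second case the simple sum gives 17/52, one card too many, because the Ace of Spades belongs to both events.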
To make this clear, we’ll take a look at an example
]
.pull-right[ ]In the last example we asked about the probability of selecting a red circle or a circle with a white dot
In this example we’re dealing with an event that only produces one outcome.
But we can make things more complex so that we’re dealing with events that produce multiple outcomes.
There are a few different scenarios that can happen when we’re dealing with multiple events, so we’ll start simple and then get more complex…
Here our event can produce outcomes such as:
Selecting a blue circle and a red circle
Selecting a green circle and a yellow circle, etc.
We can also just count when we’re dealing with multiple events
But this is often easier to do when we draw probability trees
In the previous example, the two choices were independent
This just means that the outcome of either choice doesn’t influence the probability of the other choice
That is, we can calculate the probability of each event without considering anything about the other event
When this is the case, we can calculate the probabilities of both events occurring just by multiplying the two probabilities
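Here is a small check of the multiplication rule. The circles from the figure aren't reproduced here, so this sketch swaps in a different pair of independent events (a fair coin flip and a fair six-sided die roll, both my assumptions) and compares counting with multiplying.

```python
from fractions import Fraction
from itertools import product

# All 12 equally likely (coin, die) pairs
outcomes = list(product("HT", range(1, 7)))

def prob(event):
    """Proportion of outcomes for which event(coin, die) is True."""
    return Fraction(sum(1 for coin, die in outcomes if event(coin, die)), len(outcomes))

p_heads = prob(lambda coin, die: coin == "H")
p_six = prob(lambda coin, die: die == 6)
p_heads_and_six = prob(lambda coin, die: coin == "H" and die == 6)

print(p_heads_and_six)   # by counting
print(p_heads * p_six)   # by multiplying
```

Counting and multiplying agree here because the coin tells us nothing about the die.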
But sometimes this isn’t the case… sometimes the probability of a second event is dependent on the first event
Let us look at a simple example…
In these examples, rather than selecting two circles we’ll ask about the probability of a single circle having two features
.pull-left[ ]
.pull-right[ ]
In the last example I introduced the idea of a conditional probability
We knew P(Blue): The probability of a circle being blue independently of whether it had a dot or not
And P(Dot): The probability of a circle having a dot independently of whether it was red or not
But to answer our question we needed to know a conditional probability
P(Blue) × P(Dot|Blue): \(\frac{20}{35} \times \frac{15}{20} = \frac{15}{35}\)
We could also have done it the other way around
P(Dot) × P(Blue|Dot): \(\frac{20}{35} \times \frac{15}{20} = \frac{15}{35}\)
Or we could just count the number of circles (out of all the circles) that are both Blue and have a white dot
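The three routes to P(Blue and Dot) can be compared directly. The figure isn't reproduced here, so the counts below are inferred from the fractions on the slides (35 circles, 20 blue, 20 with a dot, 15 both); treat them as an assumption.

```python
from fractions import Fraction

# Counts inferred from the slide fractions (assumption: 35 circles in total)
counts = {
    ("Blue", "Dot"): 15,
    ("Blue", "No dot"): 5,
    ("Red", "Dot"): 5,
    ("Red", "No dot"): 10,
}
total = sum(counts.values())

n_blue = counts[("Blue", "Dot")] + counts[("Blue", "No dot")]
n_dot = counts[("Blue", "Dot")] + counts[("Red", "Dot")]
n_blue_and_dot = counts[("Blue", "Dot")]

p_blue = Fraction(n_blue, total)
p_dot = Fraction(n_dot, total)
p_dot_given_blue = Fraction(n_blue_and_dot, n_blue)  # look only at blue circles
p_blue_given_dot = Fraction(n_blue_and_dot, n_dot)   # look only at dotted circles

via_blue = p_blue * p_dot_given_blue
via_dot = p_dot * p_blue_given_dot
by_counting = Fraction(n_blue_and_dot, total)

# All three agree (15/35, which Fraction reduces to 3/7)
print(via_blue, via_dot, by_counting)
```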
This example makes it seem like conditional probability is pretty easy (it’s all about counting the correct circles!), but it can be tricky to understand!
Although it didn’t seem like it, the first example with the two sets of circles also involved conditional probabilities
However, P(Green) was equal to P(Green|Yellow) and P(Yellow) was equal to P(Yellow|Green)
The probability of picking Green didn’t change given that we’d already picked Yellow
And the probability of picking Yellow didn’t change given that we’d already picked Green
This is the mathematical definition of independence
In our blue circle white dot example we saw that P(Blue|Dot) and P(Dot|Blue) were equal
P(Blue|Dot) = \(\frac{15}{20}\)
P(Dot|Blue) = \(\frac{15}{20}\)
But a conditional probability and its inverse are not always equal
P(Red|Dot) = \(\frac{5}{20}\)
P(Dot|Red) = \(\frac{5}{15}\)
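The difference comes from which group we restrict ourselves to before counting. A minimal sketch, again assuming circle counts inferred from the slide fractions (the figure itself isn't reproduced here) and using helper names of my own:

```python
from fractions import Fraction

# Counts inferred from the slide fractions (assumption)
counts = {
    ("Blue", "Dot"): 15,
    ("Blue", "No dot"): 5,
    ("Red", "Dot"): 5,
    ("Red", "No dot"): 10,
}

def p_colour_given_marking(colour, marking):
    """P(colour | marking): restrict to circles with that marking, then count."""
    with_marking = sum(n for (c, m), n in counts.items() if m == marking)
    return Fraction(counts[(colour, marking)], with_marking)

def p_marking_given_colour(marking, colour):
    """P(marking | colour): restrict to circles of that colour, then count."""
    with_colour = sum(n for (c, m), n in counts.items() if c == colour)
    return Fraction(counts[(colour, marking)], with_colour)

# Equal for Blue only because there happen to be 20 blue and 20 dotted circles
print(p_colour_given_marking("Blue", "Dot"), p_marking_given_colour("Dot", "Blue"))

# Not equal for Red: 20 dotted circles but only 15 red ones
print(p_colour_given_marking("Red", "Dot"), p_marking_given_colour("Dot", "Red"))
```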
.pull-left[ ] .pull-right[ ]
There’s a mathematical formula that relates P(A|B) to P(B|A). This formula is known as Bayes’ theorem.
Bayes’ theorem is very useful for thinking about conditional probabilities, because conditional probabilities can sometimes be incredibly unintuitive
Consider the following example:
There is a test for an illness. The test has the following properties:
About 80% of people that actually have the illness will test positive
Only ~5% of people that don’t have the illness will test positive
Somebody, who may be sick or healthy, takes the test and tests positive…
Is that person actually sick?
.pull-left[ ]
The probability of the person actually being sick depends on the incidence of the disease.
If the disease is rare then there’s a low probability that the person is actually sick
If the disease is common then there’s a high probability that the person is actually sick
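To see how much the incidence matters, here is a rough sketch that keeps the test's properties fixed (80% and 5%, from the example) and varies how common the disease is. The three prevalence values and the function name are made up for illustration.

```python
def p_sick_given_positive(prevalence, sensitivity=0.80, false_positive_rate=0.05):
    """Share of all positive tests that come from people who are actually sick."""
    sick_and_positive = sensitivity * prevalence
    healthy_and_positive = false_positive_rate * (1 - prevalence)
    return sick_and_positive / (sick_and_positive + healthy_and_positive)

# Hypothetical prevalences: rare, fairly uncommon, very common
for prevalence in (0.001, 0.05, 0.5):
    print(prevalence, round(p_sick_given_positive(prevalence), 3))
```

With the rare disease fewer than 2 in 100 positive tests belong to sick people; with the very common one it is more than 9 in 10. At a prevalence of 5% the answer is a little under one half.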
We can work out the answer to the previous question just by counting the dots, but we can also use Bayes’ theorem.
Bayes’ theorem is given as:
$$P(A|B) = \frac{P(B|A) \times P(A)}{P(B)}$$
or
$$P(🤮\ |\ ✅) = \frac{P(✅\ |\ 🤮) \times P(🤮)}{P(✅)}$$
and when we put numbers to it…
$$\frac{4}{9} = \frac{\frac{4}{5} \times \frac{5}{100}}{ \frac{9}{100} }$$
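The same arithmetic with exact fractions, as a quick check. The counts of people are read off the formula above: 100 people, 5 of them sick, 4 sick people testing positive, and 9 positive tests in total, which leaves 5 healthy people testing positive (5% of 95 is 4.75, so whole people mean a little rounding).

```python
from fractions import Fraction

p_sick = Fraction(5, 100)            # P(sick)
p_pos_given_sick = Fraction(4, 5)    # P(positive | sick)
p_pos = Fraction(9, 100)             # P(positive)

# Bayes' theorem: P(sick | positive) = P(positive | sick) * P(sick) / P(positive)
p_sick_given_pos = p_pos_given_sick * p_sick / p_pos

# The same answer by counting: sick positives out of all positives
sick_positive, healthy_positive = 4, 5
by_counting = Fraction(sick_positive, sick_positive + healthy_positive)

print(p_sick_given_pos, by_counting)
```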
Note that the crucial values here are P(🤮) and P(✅). These are sometimes referred to as the prior probabilities or unconditional probabilities.
If you change the value of P(🤮) then you’re changing how rare or common the disease is
Reasoning about conditional probabilities like the testing example can be difficult because people often forget about the P(🤮) and P(✅) bits.
But if we ignore P(🤮) and P(✅), it’s easy to make mistakes!
Another common error is to confuse P(🤮|✅) and P(✅|🤮) or to think
P(🤮|✅) = P(✅|🤮)
But we saw from our earlier example that this isn’t the case
The media (and the scientific literature) are unfortunately littered with examples of people getting muddled with conditional probabilities
And some of these confusions can actually be dangerous!
I’ll just pick out two more examples to finish on…
You might have heard the following statistic in the media/online
50% of people who die from Covid have been vaccinated
I’ve seen this stat on social media along with the claim that it shows that the Covid vaccine doesn’t work
Let’s assume the stat is accurate. Does this mean the vaccine doesn’t work?
.pull-left[ ]
.pull-right[ ]
In this example we’re keeping the vaccine efficacy constant (50% chance of dying if not vaccinated and 10% chance of dying if vaccinated). The vaccine doesn’t prevent all deaths, but it does offer protection!
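A small sketch of why the "50% of deaths" stat says so little on its own. The two death rates are the ones from the example; the vaccination rates in the loop are hypothetical values I picked to show the effect.

```python
from fractions import Fraction

P_DIE_UNVACCINATED = Fraction(1, 2)   # 50%, from the example
P_DIE_VACCINATED = Fraction(1, 10)    # 10%, from the example

def share_of_deaths_vaccinated(vaccination_rate):
    """P(Vaccinated | Died) for a given share of the population vaccinated."""
    deaths_vaccinated = vaccination_rate * P_DIE_VACCINATED
    deaths_unvaccinated = (1 - vaccination_rate) * P_DIE_UNVACCINATED
    return deaths_vaccinated / (deaths_vaccinated + deaths_unvaccinated)

# Hypothetical vaccination rates; the vaccine's protection never changes
for rate in (Fraction(1, 2), Fraction(5, 6), Fraction(19, 20)):
    print(rate, share_of_deaths_vaccinated(rate))
```

With these death rates, half of all deaths are among the vaccinated once 5 in 6 people are vaccinated. P(Vaccinated|Died) climbs as more people get vaccinated, even though P(Died|Vaccinated) stays at 10%.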
A few years ago Johnson et al published a study (in a very prestigious journal) about racial bias in police shootings.
Their finding can be summed up as follows:
There is no racial bias in police shootings because people shot by police are more likely to be White than Black
This was picked up by the conservative media (e.g., Fox News) to show that organisations like BLM were fighting against a problem that didn’t exist!
But is the reasoning correct, and do the data show what Johnson et al claim?
NO!
Johnson et al, the journal reviewers, the journal editors¹, and the media were looking at whether P(Black|Shot) was larger than P(White|Shot) when they should’ve been looking at whether P(Shot|Black) was larger than P(Shot|White)
.footnote[¹ This paper has now been retracted from the journal after a campaign that started on Twitter, but the damage may already be done]
Let’s first look at the data¹ Johnson et al present
.center[
]
This graphic shows the people shot by the police.
The probability that a person is White (P(White|Shot)) is \(\frac{20}{30}\) or 66.67%
The probability that a person is Black (P(Black|Shot)) is \(\frac{10}{30}\) or 33.33%
These are the two probabilities that Johnson et al look at.
.footnote[¹ These aren’t the actual data, but I’ve simplified them to make things easier]
But let’s add some additional data. These are the people who have had encounters with police that didn’t end in a shooting
.center[
]
Johnson et al didn’t report these data, so I’ve made them up for illustration
We need these data, because instead of looking at P(Black|Shot) and P(White|Shot) we need to look at P(Shot|Black) and P(Shot|White)
Putting it all together we see this:
.center[
]
These are all the encounters that occurred between the police and civilians, including those that ended in the police shooting a civilian and those that did not.
Now we condition on being Black
.center[
]
This allows us to see P(Shot|Black), which gives \(\frac{10}{20}\) or 50%
And condition on being White
.center[
]
Which allows us to see P(Shot|White), which gives \(\frac{20}{80}\) or 25%
These data, at least, suggest there is a bias in police shootings
Unfortunately, Johnson et al didn’t collect the data they need to draw their conclusions
.pull-left[.center[
]]
.pull-right[.center[
]]
Both figures above are consistent with P(Black|Shot) = 0.33 and P(White|Shot) = 0.67
But one gives P(Shot|Black) = 0.5 and P(Shot|White) = 0.25
And the other gives P(Shot|Black) = 0.14 and P(Shot|White) = 0.67
Either one of these could¹ be the case, but Johnson et al’s data can’t tell us which, and therefore they have absolutely no basis to support their claim
.footnote[¹ Given the racial makeup of the US, the first seems more plausible than the second]
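The two scenarios can be put side by side in code. As on the slides, these are illustrative counts and not real data; the "not shot" counts for the second scenario are my inference from the 0.14 and 0.67 figures above.

```python
from fractions import Fraction

# Illustrative counts of police encounters (made up, not real data)
scenarios = {
    "first": {
        "Black": {"shot": 10, "not_shot": 10},
        "White": {"shot": 20, "not_shot": 60},
    },
    "second": {
        "Black": {"shot": 10, "not_shot": 60},
        "White": {"shot": 20, "not_shot": 10},
    },
}

def p_race_given_shot(table, race):
    """P(race | Shot): restrict to people who were shot."""
    total_shot = sum(group["shot"] for group in table.values())
    return Fraction(table[race]["shot"], total_shot)

def p_shot_given_race(table, race):
    """P(Shot | race): restrict to encounters involving that group."""
    group = table[race]
    return Fraction(group["shot"], group["shot"] + group["not_shot"])

for name, table in scenarios.items():
    for race in ("Black", "White"):
        print(name, race,
              "P(race|Shot) =", p_race_given_shot(table, race),
              "P(Shot|race) =", p_shot_given_race(table, race))
```

P(race|Shot) is identical in both scenarios, because it only uses the people who were shot. P(Shot|race) flips direction between them, and that is the quantity the question about bias depends on.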